SVitchboard 1: Small Vocabulary Tasks from Switchboard 1
Authors
Abstract
We present a conversational telephone speech data set designed to support research on novel acoustic models. Small vocabulary tasks from 10 words up to 500 words are defined using subsets of the Switchboard-1 corpus; each task has a completely closed vocabulary (an OOV rate of 0%). We justify the need for these tasks, describe the algorithm for selecting them from a large corpus, give a statistical analysis of the data and present baseline whole-word hidden Markov model recognition results. The goal of the paper is to define a common data set and to encourage other researchers to use it.
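The core of each task definition is a closed vocabulary: every utterance in a task subset contains only in-vocabulary words, so the OOV rate is 0% by construction. As a minimal illustration of that idea (this is a toy sketch, not the paper's actual selection algorithm, and the function name and example data are invented for illustration):

```python
# Toy sketch of closed-vocabulary subset selection: keep only the
# utterances whose every word lies inside a chosen vocabulary, so the
# resulting subset has a 0% out-of-vocabulary (OOV) rate by construction.
# (Illustrative only; the paper's algorithm for choosing the vocabulary
# itself from a large corpus is more involved.)

def closed_vocab_subset(utterances, vocabulary):
    """Return the utterances that contain only in-vocabulary words."""
    vocab = set(vocabulary)
    return [u for u in utterances if set(u.split()) <= vocab]

corpus = [
    "yeah right",
    "oh yeah",
    "i think so",
]
subset = closed_vocab_subset(corpus, ["yeah", "right", "oh"])
print(subset)  # -> ['yeah right', 'oh yeah']
```

In practice the interesting part is choosing the vocabulary so that the retained subset is as large as possible for a given vocabulary size, which is what motivates a dedicated selection algorithm rather than this simple filter.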
Similar Papers
SVitchboard II and fiSVer i: high-quality limited-complexity corpora of conversational English speech
In this paper, we introduce a set of benchmark corpora of conversational English speech derived from the Switchboard-I and Fisher datasets. Traditional ASR research requires considerable computational resources and has slow experimental turnaround times. Our goal is to introduce these new datasets to researchers in the ASR and machine learning communities (especially in academia), in order to f...
Resegmentation of SWITCHBOARD
The SWITCHBOARD (SWB) corpus is one of the most important benchmarks for recognition tasks involving large vocabulary conversational speech (LVCSR). The high error rates on SWB are largely attributable to an acoustic model mismatch, the high frequency of poorly articulated monosyllabic words, and large variations in pronunciations. It is imperative to improve the quality of segmentations and tr...
Prosodically-enhanced recurrent neural network language models
Recurrent neural network language models have been shown to consistently reduce the word error rates (WERs) of large vocabulary speech recognition tasks. In this work we propose to enhance the RNNLMs with prosodic features computed using the context of the current word. Since it is plausible to compute the prosody features at the word and syllable level we have trained the models on prosody fea...
Factorized Deep Neural Networks for Adaptive Speech Recognition
Recently, we have shown that context-dependent deep neural network hidden Markov models (CD-DNN-HMMs) can achieve very promising recognition results on large vocabulary speech recognition tasks, as evidenced by over one third fewer word errors than the discriminatively trained conventional HMM-based systems on the 300hr Switchboard benchmark task. In this paper, we propose and describe two type...